Entry Name:  "IIITH-RAGHAVENDRA-MC1"

VAST Challenge 2014
Mini-Challenge 1

 

 

Team Members:

Veera Raghavendra Chikka, International Institute of Information Technology Hyderabad, raghavendra.ch@research.iiit.ac.in     PRIMARY
Kamalakar Karlapalem(Advisor), International Institute of Information Technology Hyderabad, kamal@iiit.ac.in


Student Team:  YES

 

Analytic Tools Used:

Semantic parsing software of Lund University : Semantic parsing software using PropBank and NomBank frames
SIMILE project of MIT : Interactive tool for displaying timeline and description of events.
D3.js JavaScript library for web visualizations.
R-programming language : R is a language and environment for statistical computing and graphics.

Approximately how many hours were spent working on this submission in total?

180 hours

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2014 is complete? YES

 

 

Video:

http://youtu.be/5NpqgPoVilo

 

MC1_raghavendra

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

MC1.1 Provide a visual representation of the structure of the Protectors of Kronos network, with supporting evidence.

a.      Who are the leaders?

b.      Who is part of the extended network?

c.       How has the group structure and organization changed over time?

d.      Where are the potential connections between the POK and GAStech?

Provide novel visualizations appropriate for communicating key information to the busy leaders of the investigation. Please limit your response to no more than eight images and 500 words.

 

Approach :

After a thorough literature survey, we noticed several analysis strategies for tackling an investigative analysis problem. The strategy we followed for this VAST challenge is referred to as "Find a clue, follow the trail".

Our work started with skimming through the dataset (mainly the historical documents and news reports). The historical documents hold crucial information about the origins of the POK's structure. We then mined information from the news reports with the Stanford NER tool, which made this task much easier: it opens each file and tags its PERSON, LOCATION and ORGANISATION entities. Figure 1.1 shows a modified version of the stanford-ner-gui tool with a tagged file; the rightmost frame lists all PERSON entities of the given news reports.

We then used the R programming language to analyse the NER-tagged news reports with different kinds of visualizations, such as adjacency matrices, word clouds and parallel coordinates, to find relationships among the entities.

Fig 1.1: Modified Stanford NER GUI

Solution:

a. Members of a government or a company usually have designations from which a person's role can be inferred. The Protectors of Kronos (POK), being a private organisation, has only two levels of designation: leader and member. We therefore came up with a few rule-based patterns for identifying the leaders of the POK. With these we found three leaders of the POK in different periods: Henk Bodrogi, Elian Karel and Silvia Marek.

b. We combined the information extracted from the historical documents, news reports and EmployeeRecords to find the extended network of the POK, which includes Michale Kraft, Isia Vann and Hennie Osvaldo. The whole structure of the organisation under Silvia Marek is shown in Figure 1.3.

c. From the above analysis, we have the POK leaders, their network and their timeline. We provide a composite, interactive visualization comprising the POK leaders, their timeline and the structure of the POK, as shown below.

Fig 1.2: POK Leaders, their timeline and structure of POK

Evidence Document : 10 year report clean of Historical Documents

Initial Stage: Formation of the Grassroots Effort
When it became clear the Council could not agree on a course of action the Elodis citizens decided to address the problem independently. At this point the SMO was in the initial stages of development, being still a group of citizens with similar concerns. The primary actors in the initial stage formed the seven founding members of the Protectors of Kronos SMO: Henk Bodrogi, Carmine Osvaldo, Ale L. Hanne, Jeroen Karel, Valentine Mies, Yanick Cato and Joreto Katell.

Evidence Document : 5 year report clean of Historical Documents

"Profile of Dominant POK Personalities" section.

d. We found a few hints in the EmployeeRecords of GAStech employees who had been discharged for misconduct. Such GAStech employees who share a last name with POK members are considered potential connections between GAStech and the POK. They are marked in red in Fig 1.3.

Evidence from EmployeeRecords :

GAStech employees from Kronos in the "Security" department with a "General Discharge" whose last name matches that of a POK member. That is, "Kronos" AND "Security" AND "General Discharge" AND "Last Name".

Osvaldo,Hennie,31/05/1988,Kronos,Male,Kronos,BirthNation,31/05/1988,,,,Security,Perimeter Control,07/06/2011,Hennie.Osvaldo@gastech.com.kronos,ArmedForcesOfKronos,GeneralDischarge,01/10/2010

Vann,Isia,13/12/1986,Kronos,Male,Kronos,BirthNation,13/12/1986,,,,Security,Perimeter Control,14/12/2007,Isia.Vann@gastech.com.kronos,ArmedForcesOfKronos,GeneralDischarge,01/10/2007
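The filter described above can be sketched in a few lines of Python. This is only an illustration, not the tooling used for the submission: the column positions are inferred from the two rows quoted above, and the function name `potential_connections` and the surname set passed to it are hypothetical.

```python
import csv
import io

# Column positions inferred from the two rows quoted above (an assumption;
# the header row of EmployeeRecords is not reproduced in this report).
LAST, FIRST, CITIZENSHIP, DEPARTMENT, DISCHARGE = 0, 1, 5, 11, 16

def potential_connections(csv_text, pok_surnames):
    """Return (first, last) for rows matching
    "Kronos" AND "Security" AND "General Discharge" AND a POK last name."""
    hits = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= DISCHARGE:
            continue  # skip blank or truncated lines
        if (row[LAST] in pok_surnames
                and row[CITIZENSHIP] == "Kronos"
                and row[DEPARTMENT] == "Security"
                and row[DISCHARGE] == "GeneralDischarge"):
            hits.append((row[FIRST], row[LAST]))
    return hits

# The two evidence rows from EmployeeRecords, copied verbatim.
RECORDS = """Osvaldo,Hennie,31/05/1988,Kronos,Male,Kronos,BirthNation,31/05/1988,,,,Security,Perimeter Control,07/06/2011,Hennie.Osvaldo@gastech.com.kronos,ArmedForcesOfKronos,GeneralDischarge,01/10/2010
Vann,Isia,13/12/1986,Kronos,Male,Kronos,BirthNation,13/12/1986,,,,Security,Perimeter Control,14/12/2007,Isia.Vann@gastech.com.kronos,ArmedForcesOfKronos,GeneralDischarge,01/10/2007
"""

# The surname set is illustrative; in practice it would be the list of
# POK member last names gathered from the documents.
print(potential_connections(RECORDS, {"Osvaldo", "Vann"}))
```

A last-name match alone is weak evidence, which is why the result is treated as a potential connection rather than a confirmed one.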

Fig 1.3: POK structure under the leadership of Silvia Marek. The intensity of the red mark indicates the member's potential connection with GAStech.

 

 

 

 

 

MC1.2 Describe the events of January 20-21, 2014. What is the timeline of events? Please limit your response to no more than ten images and 500 words.

 

Solution:

Data Preprocessing :

We collected all the news reports (about 265 entries) published on January 20 and January 21 into a separate workspace. A few news reports do not have publication time information. Under the basic assumption that two news reports carrying exactly the same information must have been published at the same time, we inferred the publication time of these articles: we used a TF-IDF weighted vector space model to find, for each report without a publication time, the similar articles from which to take the time.
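A minimal sketch of this time-imputation step, assuming whitespace tokenization, a smoothed IDF and cosine similarity. The function names and the similarity threshold of 0.9 are illustrative assumptions, not the settings used for the submission.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from whitespace-tokenized text."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF so that terms present in every document keep a weight.
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def impute_times(articles, threshold=0.9):
    """articles: list of (text, time or None).
    An untimed article takes the time of its most similar timed article,
    but only if the similarity reaches the (assumed) threshold."""
    vecs = tfidf_vectors([text for text, _ in articles])
    times = [ts for _, ts in articles]
    for i, (_, ts) in enumerate(articles):
        if ts is not None:
            continue
        best_sim, best_time = 0.0, None
        for j, (_, other) in enumerate(articles):
            if other is None:
                continue
            sim = cosine(vecs[i], vecs[j])
            if sim > best_sim:
                best_sim, best_time = sim, other
        if best_sim >= threshold:
            times[i] = best_time
    return times
```

Articles with no sufficiently similar timed counterpart keep `None`, so they can be handled manually rather than given a guessed time.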

What is an event?

We define an event as a significant occurrence involving an agent (a person or anything having some role) at a determinable time.

Approach :

Event detection has long been an important task. After studying many research works, we came up with a new approach combining an IR technique (N-gram bursty words) and an NLP approach (semantic role labeling). Using this approach we scored the individual sentences of each news article and took the top-scored sentences (say, 202) of our dataset, which we consider as events. We then filtered out the duplicate events and ended up with 58 events. Finally, we manually filtered out events that are less relevant to the GAStech disappearance and finalized 32 events. We used the SIMILE project's interactive tool to display the timeline and the description of the events.

Fig 2.1: Events timeline showing that the GAStech meeting was held from 8:00am to 10:00am on Jan 20, 2014.
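The scoring and de-duplication funnel can be sketched as follows. This is a simplified illustration under stated assumptions: only the bursty N-gram part is shown (the semantic role labeling component is omitted), burstiness is approximated as the count in the time window divided by one plus the count in a background set, and duplicates are detected by Jaccard token overlap. All function names and the duplicate threshold are hypothetical.

```python
from collections import Counter

def ngrams(sentence, n_max=2):
    """Unigrams and bigrams of a whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def burst_scores(window, background):
    """Assumed burstiness: frequent in the window, rare in the background."""
    win = Counter(g for s in window for g in ngrams(s))
    bg = Counter(g for s in background for g in ngrams(s))
    return {g: c / (1 + bg[g]) for g, c in win.items()}

def jaccard(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def detect_events(window, background, top_k, dup_threshold=0.8):
    """Rank sentences by mean N-gram burstiness, keep the top_k,
    then drop near-duplicates of sentences already kept."""
    scores = burst_scores(window, background)
    ranked = []
    for s in window:
        grams = ngrams(s)
        if grams:
            ranked.append((sum(scores[g] for g in grams) / len(grams), s))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    events = []
    for _, s in ranked[:top_k]:
        if all(jaccard(s, kept) < dup_threshold for kept in events):
            events.append(s)
    return events
```

The final manual relevance filter (58 events down to 32) has no automatic counterpart in this sketch.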

Events with relevant Articles :

·         Meeting at GAStech headquarters

160.txt, 618.txt, 711.txt, 764.txt

·         GAStech employees Will appear in the hour

597.txt, 348.txt

·         Abila Fire Department trucks responded to reports of a fire at the GAStech office.

453.txt, 673.txt, 326.txt

·         The fire alarm went off at the GAStech headquarters

167.txt, 692.txt

·         People are evacuating the building

10.txt, 537.txt

·         A helicopter leaves from the roof of the building

453.txt, 763.txt

·         The fire department has just arrived

326.txt, 70.txt, 563.txt

·         A GAStech employee reports that the fire alarm was pulled in response to a bomb threat.

283.txt, 710.txt, 458.txt, 557.txt

·         Announcement that building is clear and it was a false alarm

78.txt, 299.txt, 215.txt, 540.txt, 292.txt

·         No appearance of the GAStech executives and Sanjorge

697.txt

·         The police of Abila has arrived to the centers of GAStech

355.txt, 806.txt

·         Two police squad cars have arrived

625.txt, 490.txt

·         Confirmation of missing of GAStech employees

522.txt, 828.txt

·         Kronos government officials arrived at the GAStech headquarters.

805.txt, 811.txt

·         Abila airport confirms two private jets left today

633.txt, 718.txt, 721.txt

·         A GAStech employee statement suspecting men dressed in black.

660.txt, 253.txt

·         A coordinator of GAStech informs that the people in black were suppliers.

395.txt, 368.txt

·         The suppliers for the reunion of the first breakfast have been freed.

417.txt, 322.txt

·         Media reports are coming in that a number of GAStech employees were kidnapped.

87.txt, 94.txt

·         Two private jets left Abila airport today with fourteen to sixteen passengers between the two.

429.txt, 633.txt, 718.txt, 721.txt

·         Confirmation of Airport officials on one of the private jets

592.txt

·         Kronos police spokesman has released a statement there is no indication that the missing individuals have left the island of Kronos.

567.txt, 817.txt

·         There are approximately fourteen individuals unaccounted for among the staff of GAStech

313.txt, 485.txt

·         A government spokesman stated that the GAStech employees had been kidnapped

172.txt, 386.txt

·         Tethys officials have arrived in Kronos

637.txt

·         The civil employees of the airport have confirmed the arrival of a classified jet from tethys

118.txt, 744.txt

·         The news conference of unit of the police of Abila previewed for the 9:00

30.txt, 418.txt

·         Abila police revised the number of missing GAStech employees to ten.

276.txt, 693.txt, 676.txt

·         CEO Sten Sanjorge is not among the missing GAStech employees

556.txt, 624.txt

·         A very important message to the people of Kronos.

110.txt, 793.txt

·         RANSOM DEMANDS FROM POK

219.txt, 822.txt, 261.txt, 310.txt, 824.txt, 178.txt

Fig 2.2: Displaying the description of an event.
Fig 2.3: Displaying all events in a single image using R

 

 

 

 

MC1.3 Identify at least two possible explanations why the GAStech employees may be missing. What evidence do you have to support each of these explanations? Please limit your response to no more than three additional images and 200 words.

Approach :

To find possible explanations for the disappearance of the GAStech employees, we concentrated only on the speculations made after the disappearance. Using a bag of words representing speculation, we collected about 20 news articles from the dataset. Clustering those 20 news reports gave 6 clusters, representing 6 unique speculations, as shown in Figure 3.1. We then examined each cluster in light of the events above (Answer 1.2) and arrived at the two possible explanations for the disappearance.
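The selection and grouping steps can be sketched as below. The clustering algorithm is not named in this report, so a simple greedy single-pass clustering on Jaccard token overlap stands in for it here; the cue words, the threshold and the function names are all illustrative assumptions.

```python
def tokens(text):
    return set(text.lower().split())

def select_speculative(articles, cues):
    """Keep articles (name -> text) containing at least one cue word."""
    return {name: text for name, text in articles.items()
            if tokens(text) & cues}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def greedy_clusters(articles, threshold=0.3):
    """Assign each article to the first cluster whose seed article is
    similar enough, otherwise start a new cluster (a stand-in method)."""
    clusters = []  # list of (seed token set, member names)
    for name, text in articles.items():
        toks = tokens(text)
        for seed, members in clusters:
            if jaccard(toks, seed) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((toks, [name]))
    return [members for _, members in clusters]
```

With a cue set such as `{"kidnapped", "fled"}` (hypothetical), `greedy_clusters(select_speculative(articles, cues))` returns one list of file names per speculation, which can then be checked against the event timeline.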

Fig 3.1: Clusters based on the speculations made on GAStech disappearance

Solution :

If we closely examine the helicopter episode (articles 633.txt, 718.txt, 721.txt), we notice that the passengers of the first plane had fled from the GAStech building (articles 453.txt and 763.txt) and appeared to be in a hurry, while the passengers of the second plane came from an unknown location and were far more relaxed than the first group. However, article 592.txt states: "Airport officials have confirmed that they were from a private company, and that this company was not GAStech". So the GAStech disappearance has nothing to do with the second plane. Using the approach described above, we provide hypothetical explanations centred on the first plane.

Possibility 1 :

The POK kidnapped the GAStech employees, as it is clearly mentioned that the POK demanded a ransom.

Evidence Articles : 107.txt, 140.txt, 326.txt, 397.txt, 494.txt, 49.txt, 712.txt, 818.txt

Possibility 2 :

GAStech executives have fled the country with their fortune and money.

Evidence Articles : 280.txt, 616.txt

Fig 3.2: Interactive tool showing the speculation clusters with their evidence news reports.